Feature Engineering Bookcamp by Sinan Ozdemir
Author: Sinan Ozdemir
Language: eng
Format: mobi
Publisher: Manning Publications Co.
Published: 2022-08-24T22:00:00+00:00
❶ Using a custom tokenizer
❷ Not needed anymore, as our tokenizer is removing stop words and is lowercasing
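As a rough illustration of what a custom tokenizer of this kind might look like, the sketch below lowercases text, drops stop words, and stems each remaining token. It is not the book's listing. The stop-word list, the suffix rules, and the names `STOP_WORDS`, `toy_stem`, `custom_tokenizer`, and `bag_of_words` are all hypothetical, and the suffix stripper is only a crude stand-in for a real stemmer such as NLTK's Snowball stemmer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "and", "are", "is", "the", "to", "was", "were"}

# Suffixes stripped by the toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(token):
    """Crude suffix stripping; an assumed stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        # Keep at least three characters so short words stay intact.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def custom_tokenizer(text):
    """Lowercase, split into word tokens, drop stop words, then stem."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [toy_stem(t) for t in tokens if t not in STOP_WORDS]


def bag_of_words(text):
    """Count stemmed tokens, as a bag-of-words vectorizer would per document."""
    return Counter(custom_tokenizer(text))


if __name__ == "__main__":
    print(custom_tokenizer("The flights were delayed and the waiting is annoying"))
    print(bag_of_words("Delayed flights, delayed bags"))
```

In scikit-learn, a callable like `custom_tokenizer` can be passed through the `tokenizer` parameter of `CountVectorizer` or `TfidfVectorizer`. Because the callable already lowercases and filters stop words, the vectorizer's own `lowercase` and `stop_words` options become redundant, which is the point of the second callout. Note that stemming collapses distinct surface forms (here "delayed" and "delay") into one feature, so any signal that distinguished them is lost.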
Our results (figure 5.19) show a reduction in performance, like we saw with our text cleaning.
Figure 5.19 Our stemmer is not showing a boost in performance, which implies that the tokens we were trying to remove carried enough signal that removing them lowered our pipeline's performance.
It looks like neither of our feature improvement techniques showed a boost in performance, but this is OK! They were both worth trying, and the result reveals a deeper truth about our data.
When working with text data, it's tempting to get frustrated when basic feature engineering techniques don't work, but context seems to really matter here, and this is often true in NLP cases. In our next few chapters, we will start to move away from interpretable features that represent individual tokens in our text and toward latent features: features that represent a hidden structure of data that is more complex than bag-of-words.
